Abstract
We discuss the use of Bayesian networks as robust probabilistic models of the multivariate statistical dependencies among interacting variables in transcriptional regulatory networks. We explain how principled scores can be computed to compare network models with one another in terms of their ability to explain observed data simply. With principled scores, we can automatically learn static or dynamic network models that provide simple explanations for a variety of high-throughput data. We make a case for, and demonstrate the utility of, informative priors over network structures and parameters: informative priors can be used to incorporate different kinds of data into the learning process, and also to guide the learning process toward network models that exhibit greater biological plausibility. Results from both simulated and experimental data illustrate the benefits of this modeling framework.
Introduction
Proteins are the primary molecular workhorses of the cell, playing significant roles in metabolism, biosynthesis and degradation, transport, homeostasis, structure and scaffolding, motility, sensing, signaling and signal transduction, replication, and repair. However, one of the most intriguing roles for proteins is that of transcriptional regulation: the control of precisely which genes are transcribed into RNA at any given time. Because ribosomes subsequently translate most of this RNA into protein, proteins are in large part responsible for regulating their own existence. Although much has been learned about the large network of molecular interactions that regulate transcription, far more remains to be learned.
Discovering and understanding the operation of large transcriptional regulatory networks is clearly an important problem in both molecular and synthetic biology.